Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: update to LAPIS 0.3.4, SILO 0.2.22 #2934

Merged
merged 3 commits into from
Oct 7, 2024
Merged

Conversation

fengelniederhammer
Copy link
Contributor

@fengelniederhammer fengelniederhammer commented Oct 2, 2024

preview URL: https://lapis034silo0222.loculus.org/

LAPIS Changelog

0.3.4 (2024-10-02)

Features

  • lapis: order CSV/TSV columns of aggregated/details data as specified in the request fields or as in the database config (6cf8704), closes #910

Bug Fixes

  • lapis-docs: fix link on landing page (d4137b3)

0.3.3 (2024-09-25)

Features

  • lapis: abort on startup when silo url has invalid syntax (373d662), closes #939

Bug Fixes

  • lapis-docs: Sequence file naming scheme uses indexes now not names (1e7cd60)

0.3.2 (2024-09-10)

Features

  • lapis: add healthcheck to Docker containers (0944990), closes #813
  • lapis: add instance name to landing page json (6c5f81b)

SILO Changelog

0.2.22 (2024-10-01)

Bug Fixes

  • correctly escape quotes in field names (#595) (7e7b448)

0.2.21 (2024-09-20)

Bug Fixes

  • preprocessing: resolves spurious OOM crashes when handling large datasets (6e4eae2)

0.2.20 (2024-09-17)

Features

  • make duckdb memory limit configurable via preprocessing config (c112eb1)

0.2.19 (2024-09-12)

Bug Fixes

  • preprocessing: validate that either a ndjson or tsv file was given as input (#564) (7bf2ef7)

0.2.18 (2024-09-12)

Features

  • do not abort on assertment failures (4e5f598)
  • panic: add ASSERT, UNREACHABLE, UNIMPLEMENTED, safer env var (867d08f)
  • update duckdb version (577c1b2)

Bug Fixes

  • duplicate key in sample.ndjson (c9b5232)
  • increase duckdb memory limit to 80GB (20131bb)
  • order of randomized tests changed (d29b2f2)

Screenshot

PR Checklist

- [ ] All necessary documentation has been adapted.
- [ ] The implemented feature is covered by an appropriate test.

  • Manual testing:
    • Let cook on pathoplexus staging to see if all is good before merging here
    • Integrity check on pathoplexus staging vs prod, using snakemake test pipeline
  • Adjust configs to take advantage of new features
    • Health checks
    • TSV column order

@fengelniederhammer fengelniederhammer added preview Triggers a deployment to argocd deployment Code changes targetting the deployment infrastructure labels Oct 2, 2024
@corneliusroemer
Copy link
Contributor

@fengelniederhammer thanks! Is there anything that might break stuff for us? Can't see anything immediately looking at changelog, I assume you would have told us if there was something breaking, but better check explicitly with you :)

@theosanderson
Copy link
Member

theosanderson commented Oct 2, 2024

We should play with TSV downloads and see if our new order looks reasonable as a starting point (doesn't need to be perfect - just not really bad!)

@fengelniederhammer
Copy link
Contributor Author

There should be nothing that is relevant for Loculus, except for the "order CSV/TSV columns" feature in LAPIS.

The changes in SILO were mostly internal stuff or preprocessing issues that we had with large datasets.

@corneliusroemer
Copy link
Contributor

corneliusroemer commented Oct 2, 2024

Is there anything we could/need to do to take advantage of new features? In particular, from the changelog I'm wondering about the following:

Btw, great changelog, well done @fengelniederhammer and @Taepper on this!

We need to make sure we test this well, e.g. on staging, before merging as in the past there've been unexpected change in SILO/LAPIS. We should probably come up with a checklist for things to do whenever we upgrade SILO/LAPIS versions, including integrity checks etc.

Here is a start for the checklist, feel free to add more points (added to main PR description):
image

@fengelniederhammer
Copy link
Contributor Author

  • lapis: add healthcheck to Docker containers

Loculus could benefit, but AFAIK the LAPIS deployments already have there custom healthchecks implemented in the Helm chart. There would not be significant improvement.

  • make duckdb memory limit configurable via preprocessing config

Given the current size of datasets, this should be pretty much irrelevant. We were having issues because our datasets are in the order of magnitude of 100 GB.

Copy link
Contributor

@corneliusroemer corneliusroemer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making a blocking review not because of known issues but because we should generally check lapis/silo bumps carefully based on past experience. See checklist in PR description for an idea of what to do.

@corneliusroemer
Copy link
Contributor

corneliusroemer commented Oct 2, 2024

lapis: add healthcheck to Docker containers

Loculus could benefit, but AFAIK the LAPIS deployments already have there custom healthchecks implemented in the Helm chart. There would not be significant improvement.

It's best to push anything to dependency if possible and only redo what's necessary. So I would welcome if you showed us how to make use of the new health check to simplify/minifiy our helm config. Not blocking, we can make an issue for this as well.

@fengelniederhammer
Copy link
Contributor Author

How do we move forward on this?

* [ ]  Manual testing:
  
  * [ ]  Let cook on pathoplexus staging to see if all is good before merging here
  * [ ]  Integrity check on pathoplexus staging vs prod, using snakemake test pipeline

I don't have access to Pathoplexus deployment infrastructure, someone else would need to deploy and test.
(Also I see quite low risk in updating to these versions as there have been very few changes with quite little impact - of course still good to test.)

* [ ]  Adjust configs to take advantage of new features
  * [ ]  Health checks

I don't see much benefit from implementing the new health check.

  * [ ]  TSV column order

That should be it's own feature - implemented in a separate issue.

@corneliusroemer
Copy link
Contributor

corneliusroemer commented Oct 7, 2024

I don't have access to Pathoplexus deployment infrastructure

You likely do have access like anyone else: by making PRs - but it's true that it's easier for me to do it as I know exactly which PRs to make.

But can you still check if you have permission to make such PRs? Would not harm if you knew how the deployment works, just in case!

It's quite easy:

I've now put it on staging, but the PRs above show how to do it :)

corneliusroemer added a commit to pathoplexus/loculus_deployments that referenced this pull request Oct 7, 2024
@theosanderson
Copy link
Member

(If Fabian had made a PR it also wouldn't have triggered an image build so would require a bit of additional work to deploy - just for info, not arguing one way or the other)

@corneliusroemer
Copy link
Contributor

(If Fabian had made a PR it also wouldn't have triggered an image build so would require a bit of additional work to deploy - just for info, not arguing one way or the other)

Thanks I wasn't sure about that - would just require triggering an action run, but as approval is necessary anyways this is not a real blocker either.

@fengelniederhammer
Copy link
Contributor Author

What's actually the benefit of testing on Pathoplexus? Why isn't testing on the Loculus preview enough?

@theosanderson
Copy link
Member

Just that the fields may be in a different order in PP - though on reflection they are probably not! Checking that the order in yaml is the same and testing the loculus preview would also work.

@theosanderson
Copy link
Member

Had a look at the preview

image

We should probably move accessionVersion to be next to accession and version on pathoplexus alongside merging this. Also the dataUseTermsUrl is split apart from the other data use terms.

@fengelniederhammer
Copy link
Contributor Author

I wonder whether we want to move versionStatus and versionComment next to version?

@corneliusroemer
Copy link
Contributor

What's actually the benefit of testing on Pathoplexus? Why isn't testing on the Loculus preview enough?

Loculus has only artificial data, lacks real users, real user submitted data etc - lots of things we can catch on PP that we can't on Loculus.

@anna-parker and I tested on staging:

  • Fields in tsv are ordered differently (as expected), but no value changes as verified with csv-diff
  • We tested that revoking makes it through as well as a manual test just because we can now test this

# Changelog

## [0.2.22](GenSpectrum/LAPIS-SILO@v0.2.21...v0.2.22) (2024-10-01)

### Bug Fixes

* correctly escape quotes in field names ([#595](GenSpectrum/LAPIS-SILO#595)) ([7e7b448](GenSpectrum/LAPIS-SILO@7e7b448))

## [0.2.21](GenSpectrum/LAPIS-SILO@v0.2.20...v0.2.21) (2024-09-20)

### Bug Fixes

* **preprocessing:** resolves spurious OOM crashes when handling large datasets ([6e4eae2](GenSpectrum/LAPIS-SILO@6e4eae2))

## [0.2.20](GenSpectrum/LAPIS-SILO@v0.2.19...v0.2.20) (2024-09-17)

### Features

* make duckdb memory limit configurable via preprocessing config ([c112eb1](GenSpectrum/LAPIS-SILO@c112eb1))

## [0.2.19](GenSpectrum/LAPIS-SILO@v0.2.18...v0.2.19) (2024-09-12)

### Bug Fixes

* preprocessing: validate that either a ndjson or tsv file was given as input ([#564](GenSpectrum/LAPIS-SILO#564)) ([7bf2ef7](GenSpectrum/LAPIS-SILO@7bf2ef7))

## [0.2.18](GenSpectrum/LAPIS-SILO@v0.2.17...v0.2.18) (2024-09-12)

### Features

* do not abort on assertment failures ([4e5f598](GenSpectrum/LAPIS-SILO@4e5f598))
* panic: add `ASSERT`, `UNREACHABLE`, `UNIMPLEMENTED`, safer env var ([867d08f](GenSpectrum/LAPIS-SILO@867d08f))
* update duckdb version ([577c1b2](GenSpectrum/LAPIS-SILO@577c1b2))

### Bug Fixes

* duplicate key in sample.ndjson ([c9b5232](GenSpectrum/LAPIS-SILO@c9b5232))
* increase duckdb memory limit to 80GB ([20131bb](GenSpectrum/LAPIS-SILO@20131bb))
* order of randomized tests changed ([d29b2f2](GenSpectrum/LAPIS-SILO@d29b2f2))
# Changelog

## [0.3.4](GenSpectrum/LAPIS@v0.3.3...v0.3.4) (2024-10-02)

### Features

* **lapis:** order CSV/TSV columns of aggregated/details data as specified in the request fields or as in the database config ([6cf8704](GenSpectrum/LAPIS@6cf8704)), closes [#910](GenSpectrum/LAPIS#910)

### Bug Fixes

* **lapis-docs:** fix link on landing page ([d4137b3](GenSpectrum/LAPIS@d4137b3))

## [0.3.3](GenSpectrum/LAPIS@v0.3.2...v0.3.3) (2024-09-25)

### Features

* **lapis:** abort on startup when silo url has invalid syntax ([373d662](GenSpectrum/LAPIS@373d662)), closes [#939](GenSpectrum/LAPIS#939)

### Bug Fixes

* **lapis-docs:** Sequence file naming scheme uses indexes now not names ([1e7cd60](GenSpectrum/LAPIS@1e7cd60))

## [0.3.2](GenSpectrum/LAPIS@v0.3.1...v0.3.2) (2024-09-10)

### Features

* **lapis:** add healthcheck to Docker containers ([0944990](GenSpectrum/LAPIS@0944990)), closes [#813](GenSpectrum/LAPIS#813)
* **lapis:** add instance name to landing page json ([6c5f81b](GenSpectrum/LAPIS@6c5f81b))
@corneliusroemer corneliusroemer merged commit 006c181 into main Oct 7, 2024
15 checks passed
@corneliusroemer corneliusroemer deleted the lapis034silo0222 branch October 7, 2024 12:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
deployment Code changes targetting the deployment infrastructure preview Triggers a deployment to argocd
Projects
None yet
3 participants